of the Structural Proteins, mRNAs, and Genes of Coronaviruses
We propose a nomenclature to replace the various systems currently in use to designate coronavirus structural
proteins, mRNAs, and genes/open reading frames. The nonstructural proteins have not been addressed. © 1990 Academic Press, Inc.


Several names are currently used to refer to each of
the three or four structural proteins of coronaviruses,
with corresponding and different acronyms. Similarly,
the mRNAs (reviewed in Refs. (1) and (2)) have been
referred to by numbers or by letters. Finally, the
genes/open reading frames (ORFs) have been designated by
different authors in several different ways.

To overcome the confusion thus created, the Coronavirus
Study Group of the Vertebrate Virus Subcommittee
of the International Committee on Taxonomy of
Viruses has reviewed the situation. At the Fourth
International Symposium on Coronaviruses, held in July
1989 at King’s College, Cambridge, England, the
Group recommended a revised nomenclature to be
used for all coronaviruses. The guidelines that have
been formulated, if followed, will allow for newly
identified mRNAs, genes, or ORFs to be named without
creating confusion. The Study Group considered it
inappropriate to make recommendations regarding
proteins that are believed to be nonstructural because at
the moment information is limited.


STRUCTURAL PROTEINS 


The recommended acronyms for the structural proteins
are shown in Table 1. The virion proteins have
recently been reviewed (2).
The large surface projection (spike or peplomer glycoprotein)
has previously been referred to as S (3) or E2
(4). The acronym S may be used to denote the primary
translation product and generally to refer to the spike
glycoprotein. In some, though not all, coronaviruses, S
is cleaved into two glycopolypeptides, the amino
(N)-terminal S1 (E2B) and the carboxy (C)-terminal S2 (E2A)
glycopolypeptides (the previous alternative acronyms
are shown in parentheses (5)). The hemagglutinin-esterase
(HE) glycoprotein has frequently been referred
to by its approximate molecular weight, i.e., about
65,000 and 140,000 in its reduced and nonreduced
forms, respectively, and more recently by the acronyms
E3, H, and HA. Some of the coronaviruses do not
possess a gene for HE, while some strains of others
have an incomplete HE gene which is not expressed.
Those coronaviruses which cause hemagglutination
but do not have the HE protein (e.g., infectious bronchitis
virus (IBV), transmissible gastroenteritis virus) have
only poor hemagglutination activity. In contrast, those
viruses with good hemagglutination activity do have
the HE protein, although the presence of HE does not
necessarily mean that the virus causes hemagglutination
(e.g., some strains of murine hepatitis virus (MHV)).
The HE of MHV exhibits amino acid homology with the
HEF1 subunit of the influenza C virus surface glycoprotein
(6). It has been shown that the HE of bovine coronavirus
is a receptor-destroying enzyme with acetylesterase
activity (7).

The Study Group has recommended that the integral
membrane glycoprotein, M (previously also designated
E1 (4)), should not be referred to as a matrix protein.
The structure of this protein is substantially different
from that of the matrix proteins of para- and
orthomyxoviruses. The nucleocapsid (N) protein is sometimes
referred to as the nucleoprotein, with the same acronym.
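
The correspondence between the earlier acronyms mentioned above and the recommended ones can be written as a small lookup. This is only an illustrative sketch: the names `LEGACY_TO_RECOMMENDED`, `RECOMMENDED`, and `recommended_acronym` are hypothetical and are not part of the Study Group's recommendation.

```python
# Earlier acronyms mapped to the recommended ones, as given in the text.
# All identifiers here are illustrative assumptions.
LEGACY_TO_RECOMMENDED = {
    "E2": "S",     # spike glycoprotein
    "E2B": "S1",   # N-terminal cleavage product of S
    "E2A": "S2",   # C-terminal cleavage product of S
    "E3": "HE",    # hemagglutinin-esterase glycoprotein
    "H": "HE",
    "HA": "HE",
    "E1": "M",     # integral membrane glycoprotein
}

# Acronyms recommended for the structural proteins.
RECOMMENDED = {"S", "S1", "S2", "HE", "M", "N"}


def recommended_acronym(acronym):
    """Return the recommended acronym for an earlier or current acronym."""
    if acronym in RECOMMENDED:
        return acronym
    # Raises KeyError for acronyms that the text does not mention.
    return LEGACY_TO_RECOMMENDED[acronym]
```

For instance, `recommended_acronym("E2A")` gives `"S2"`, while an acronym not listed in the text raises `KeyError` rather than being guessed.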


mRNAs 


mRNAs are to be referred to by numbers 1, 2, 3, ...
in order of decreasing size. The genome-sized mRNA
is therefore mRNA 1. Consequently, the mRNAs of IBV,
previously denoted as F, E, D, C, B, and A (8), should
henceforth be referred to as 1, 2, 3, 4, 5, and 6,
respectively.
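
As a minimal sketch of the numbering rule, the earlier IBV letters can be renumbered by listing them from the largest mRNA to the smallest. The function name `renumber_by_decreasing_size` and the list name are hypothetical; only the letter-to-number correspondence comes from the text.

```python
# Earlier IBV mRNA letters, listed from the largest (genome-sized) mRNA
# to the smallest, in the order given in the text.
LEGACY_IBV_MRNAS = ["F", "E", "D", "C", "B", "A"]


def renumber_by_decreasing_size(labels_largest_first):
    """Map each earlier label to its number; the largest mRNA becomes mRNA 1."""
    return {
        label: f"mRNA {number}"
        for number, label in enumerate(labels_largest_first, start=1)
    }


ibv_names = renumber_by_decreasing_size(LEGACY_IBV_MRNAS)
```

Here `ibv_names["F"]` is `"mRNA 1"` and `ibv_names["A"]` is `"mRNA 6"`, in agreement with the correspondence stated above.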

Coronaviruses differ in the number of their subgenomic
mRNAs. It is recognized that mRNAs of different
coronaviruses encoding the same protein may have
different numbers; e.g., the S glycoprotein of IBV and
MHV is encoded by mRNAs 2 and 3, respectively.
However, this helps to highlight a major difference.

When a protein has a name and an acronym, e.g.,
spike S, the corresponding mRNA can be referred to as
mRNA 2 (S). It might be useful to label figures in this
way. On other occasions, one might refer to “mRNA
2” or the S mRNA, as appropriate, depending on the
context in which the mRNA is being discussed.

When “new” mRNAs are identified, the use of numbers
should be continued. For example, in all strains of
murine hepatitis virus examined, mRNA 2 encodes, in
the 5’ ORF, a 30,000 molecular weight protein. Some
strains, e.g., MHV-JHM, have more recently been
shown to have an additional mRNA of a size intermediate
between that of mRNA 2 and mRNA 3 (8) (9). This
intermediate mRNA, which encodes HE, should be
referred to as “mRNA 2-1.” A dash (-), not a point, should
be used.
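
The dash convention for a newly identified intermediate mRNA can be sketched as follows. The function names are hypothetical, and the `index` parameter for a possible second intermediate mRNA is an assumption that goes beyond the single example given in the text.

```python
import re


def intermediate_mrna_name(larger_mrna_number, index=1):
    """Name an mRNA lying in size between mRNA n and mRNA n+1.

    `larger_mrna_number` is n, the number of the next larger mRNA.
    `index` is an assumed extension for further intermediate mRNAs.
    """
    return f"mRNA {larger_mrna_number}-{index}"


# Accepts "mRNA 2" or "mRNA 2-1"; a point such as "mRNA 2.1" is rejected.
_MRNA_NAME = re.compile(r"mRNA \d+(-\d+)?")


def is_valid_mrna_name(name):
    """Check that a name uses a number, optionally followed by a dash suffix."""
    return _MRNA_NAME.fullmatch(name) is not None
```

With these definitions, `intermediate_mrna_name(2)` gives `"mRNA 2-1"`, and `is_valid_mrna_name("mRNA 2.1")` is false because a point is used instead of a dash.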


GENES/ORFs 


Coronavirus mRNAs form a 3’ coterminal nested set
and the sequences that are absent from the next smallest
mRNA are often called the “unique regions.”
Therefore, with the exception of the smallest mRNA, all
the mRNAs are structurally polycistronic. However, it
is believed (see (2)) that only the unique regions are
translationally active. These regions may contain one
or more ORFs.


TABLE 1


ACRONYMS FOR CORONAVIRUS STRUCTURAL PROTEINS


Name                                   Acronym
Spike glycoprotein                     S
  N-Terminal cleavage product          S1
  C-Terminal cleavage product          S2
Hemagglutinin-esterase glycoprotein    HE
Integral membrane glycoprotein         M
Nucleocapsid protein                   N

Genes/ORFs are to be referred to by letters. When
the corresponding protein has a name, the acronym
(uppercase) should be used, e.g., S, HE. Otherwise,
the gene/ORF should be referred to by the number of
the mRNA, plus a letter (lowercase) when there is more
than one ORF. For example, the unique region of
mRNA 3 of IBV has three ORFs, 3a, 3b, and 3c. MHV-A59,
in contrast to MHV-JHM, lacks a complete ORF encoding
the HE protein and does not generate an mRNA for
HE; consequently, HE is not expressed. Thus, MHV-JHM
has ORF 2a and gene HE (ORF 2b), and the corresponding
ORFs of MHV-A59 are referred to as ORFs
2a and 2b.
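
The gene/ORF naming rule can be illustrated with a short sketch. The function `name_orfs` and its `acronyms` argument (a mapping from ORF position, counted from 0 at the 5’ end, to the acronym of a named protein) are hypothetical conveniences, not part of the recommendation itself.

```python
from string import ascii_lowercase


def name_orfs(mrna_number, n_orfs, acronyms=None):
    """Name the ORFs in the unique region of one mRNA, 5' to 3'.

    An ORF whose protein has a name takes the uppercase acronym.
    Otherwise it takes the mRNA number, plus a lowercase letter
    when the unique region holds more than one ORF.
    """
    acronyms = acronyms or {}
    names = []
    for position in range(n_orfs):
        if position in acronyms:
            names.append(acronyms[position].upper())
        elif n_orfs > 1:
            names.append(f"{mrna_number}{ascii_lowercase[position]}")
        else:
            names.append(str(mrna_number))
    return names
```

Under these assumptions, `name_orfs(3, 3)` gives `["3a", "3b", "3c"]` as for IBV, `name_orfs(2, 2, {1: "HE"})` gives `["2a", "HE"]` as for MHV-JHM, and `name_orfs(2, 2)` gives `["2a", "2b"]` as for MHV-A59.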

If a newly investigated coronavirus is shown to have
a genome organization very similar to that of an
extensively studied coronavirus, the designations of the
mRNAs and genes should be based as closely as possible
on those established for the previously studied
virus. In this way the nomenclature will help to reflect
similarities.